-
- Assembler Tutorial
- ******************
-
- This chapter explains how to use the RISC OS Forthmacs ARM assembler
- in order to create short machine language code sequences. This
- chapter is a companion to the "ARM Assembler" chapter. That chapter
- describes the syntax of individual assembly language instructions.
- This chapter addresses "higher level" issues, such as how to begin and
- end the assembly process and how to communicate arguments and results
- between Forth and assembly language.
-
-
- Motivation
- ==========
-
- For nearly all debugging jobs, writing assembly language is
- unnecessary. Test loops can usually be written more quickly and
- easily in high-level Forth, and will execute quickly enough to get the
- job done.
-
- However, in some cases the ultimate in speed is needed for certain
- critical operations, and assembly language may be the best way to go.
- In other cases, very specific combinations of machine instructions may
- exhibit problem behavior, and those combinations may need to be
- reproduced. Finally, some maintainers of the RISC OS Forthmacs system
- software itself may need to understand the assembler.
-
-
- Assumptions
- ===========
-
- The chapter assumes that you already understand the ARM instruction
- set, including such issues as processor modes, interrupts and
- register sets. If not, you should first study an ARM reference, such
- as the manual published by the chip manufacturer.
-
- Please note the syntax of this ARM assembler: like most Forth
- assemblers, it uses operand-first, operator-last syntax.
-
-
- Example: a simple "code word"
- =============================
-
- Here is a very simple assembly language program. It adds 1 to the
- contents of register r10, then returns to the Forth interpreter.
- Register r10 holds the top-of-stack value.
-
- code addone ( n -- n+1 )
- r10 r10 1 # add
- c;
-
- To execute it and display the result, you would type, for example,
-
- 5 addone .
-
- Here's what is happening, line by line:
-
- code addone ( n -- n+1 )
-
- CODE is a "defining word"; it creates a new command which can be
- executed by typing its name. The name of the new command in this case
- is ADDONE . The name could have been anything; I have chosen the name
- ADDONE because it describes the action of the program. You may
- already be familiar with another Forth defining word, ":" (pronounced
- "colon"). ":" also creates a new command; the difference between CODE
- and ":" is that ":" creates a new command whose behavior is described by a
- sequence of other Forth commands, whereas CODE creates a new command
- whose behavior is described by a sequence of assembly language
- instructions. After CODE creates the new command, it starts the
- assembler so that assembly language instructions may be entered.
-
- The stuff inside the parentheses is a comment; this particular comment
- indicates that the new command expects one argument ("n") on the stack
- before the word is executed, and after the command is executed, one
- result ("n+1") is left on the stack. The comment is optional, but its
- inclusion is strongly recommended.
-
- r10 r10 1 # add
-
- This is the assembly language instruction which defines the action of
- the new command. As you will recall from the "ARM Assembler" chapter,
- the RISC OS Forthmacs assembler syntax has the destination register
- first, followed by the source operand(s), followed by the operation
- name. So, in this case, the destination is register r10, the source
- operands are register r10 and the immediate number 1, and the
- operation is add, i.e. 1 is added to the contents of register r10,
- and the result is placed back in register r10.
-
- c;
-
- C; terminates a code definition. At the end of the instructions you
- have assembled, C; automatically appends one machine instruction
- whose effect is to return to Forth after the user-specified
- instructions have been executed.
-
- 5 addone .
-
- In order to invoke the new command, we enter the number 5 on the Forth
- stack, type the name of the command ADDONE , and then display the
- result by typing the print command "." .
-
- Perhaps you now wonder how the number got off the Forth stack and into
- the register r10, and afterwards how the number got out of r10 and
- back onto the Forth stack. The answer is simple: the top element of
- the Forth stack is always (!) kept in r10 , so no movement was
- necessary. That is why I chose r10 for the register in this example.
-
-
- Register Usage in Forth
- =======================
-
- To use the assembler effectively, you need to know which registers are
- available for use, and which of them must be left alone. Here are the
- rules:
-
- r8, r9, r12, and r14 are used internally by the Forth interpreter or
- the operating system. Their values must be left alone (otherwise the
- system will crash).
-
- r10 contains the top of the Forth stack. It is used for passing
- arguments and results back and forth between Forth and assembly
- language.
-
- r13 contains a pointer to a memory area containing the rest of the
- Forth stack (all elements other than the topmost one). That stack
- area is used for extra arguments and results. The section entitled
- "Stack Usage" tells you more about managing the stack area.
-
- r0 - r6 may be used freely within assembly language code sequences.
- Forth does not depend on the contents of these registers. However,
- some Forth commands DO use these registers as scratch registers, so
- your code should not attempt to keep important values in these
- registers from one time to the next. While your code is being
- executed, Forth will not change the contents of any of these
- registers, so you can depend on them for the duration of your assembly
- language sequence. Once your code has returned to Forth, however, the
- register values may have changed by the next time your code is
- executed.
-
- You can find more information about this subject in the "ARM
- Assembler" and "Forthmacs Implementation" chapters.
-
- While your machine code is executing, it will run at the full speed of
- the system, without any interference or overhead imposed by
- RISC OS Forthmacs. RISC OS Forthmacs does not itself use interrupts,
- so the processor will execute exactly the sequence of instructions
- which you have coded. It is possible that other software in the
- system may have set up some interrupts, but that is beyond the control
- of RISC OS Forthmacs.
-
-
- Disassembler
- ============
-
- The RISC OS Forthmacs disassembler may be used to review the assembly
- language you have created:
-
- see addone
-
- The result will look something like this:
-
- code addone
- ( 1e878 ) add r10,r10,#1
- ( 1e87c ) ldr pc,[r8],#4
-
-
- The numbers along the left hand side are the addresses at which the
- various instructions appear. The addresses shown here will almost
- certainly be different from the addresses that you see.
-
- You will notice that even though our example contained only one
- assembly language instruction, the disassembler shows one extra
- instruction. This extra instruction was automatically assembled by
- the C; command. Its purpose is to return control to Forth after the
- assembly language sequence has finished its execution (this is called
- the NEXT instruction).
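-
- The NEXT instruction in the listing above, ldr pc,[r8],#4, loads the
- program counter from the address held in r8 and then advances r8 by 4.
- The Python sketch below models that step. It assumes (this chapter does
- not say so explicitly) that r8 walks a list of code addresses belonging
- to the calling Forth definition; run, addone and CODE are hypothetical
- names used only for illustration.

```python
# Toy model of a threaded-code NEXT, i.e. "ldr pc,[r8],#4".
# Assumption: r8 walks a list of code addresses (one 4-byte cell each).

def run(thread, memory_code, top):
    """Execute the code words listed in `thread`, starting with `top` in r10."""
    regs = {"r8": 0, "r10": top}           # r8 is modelled as a byte offset
    while regs["r8"] < 4 * len(thread):
        target = thread[regs["r8"] // 4]   # pc := [r8]
        regs["r8"] += 4                    # post-increment: r8 := r8 + 4
        memory_code[target](regs)          # body of the code word runs here
    return regs["r10"]

def addone(regs):
    # Equivalent of "r10 r10 1 # add"; the NEXT step is performed by run().
    regs["r10"] += 1

CODE = {0x1E878: addone}                   # address borrowed from the listing

result = run([0x1E878, 0x1E878, 0x1E878], CODE, 5)
print(result)
```

- In this model three calls to addone in a row turn 5 into 8, and the
- only per-word overhead is the fetch-and-advance step.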
-
- The SEE command reads the name of a Forth command (in this case
- "addone"), determines what type of command it is (in this case "code",
- meaning that the command's behavior was defined by the assembler),
- and then displays a reconstruction of the source code for that
- command. SEE also works for "colon" definitions, whose behavior is
- defined in Forth instead of in assembly language. For an example of
- this, type "see find".
-
- Many of the normal Forth commands are defined in assembly language,
- and SEE can be used to look at how they are implemented. For example,
- type "see @" to see how the Forth "@" operator works (pronounced
- fetch, this operator takes an address from the top of the stack, reads
- the 32-bit contents of that address, and puts those contents back on
- top of the stack). You should try this right now and make sure you
- understand how it works. Note that the last instruction of "@" is
- exactly the same as the last instruction of "addone". Every code
- definition in RISC OS Forthmacs ends with this same instruction.
-
- SEE automatically locates the address where the code for a particular
- command begins. That address was allocated by CODE when the new
- command was defined. The disassembler can also be used to inspect
- machine code beginning at arbitrary addresses, not only that code
- which is created by CODE . Suppose that you know there is some code
- starting at address 100000 and you wish to look at it:
-
- 100000 dis
-
- On your system, this example probably won't work exactly as shown
- because your system may not have any code at address 100000 (in fact,
- it may not even have any memory there). The main point, though, is
- that you type the address of the code you wish to disassemble,
- followed by "dis".
-
- The disassembler will continue until it reaches a "definition ending"
- instruction, or until you stop it by typing the character "q", for
- "quit". It will also pause at the end of a screen and prompt you for
- a continuation character.
-
- After the disassembler has stopped, you can make it continue where it
- left off by typing +DIS .
-
-
- Setting the Starting Address
- ============================
-
- In most cases, you won't need to specify a starting address for the
- code you assemble. When you use the CODE defining word to begin
- assembling, RISC OS Forthmacs will find some appropriate memory for
- you and assemble your code there (at HERE). You can then locate the
- memory RISC OS Forthmacs has chosen by using the SEE command to
- disassemble the code, looking at the addresses displayed alongside the
- machine instructions.
-
- If you really need to assemble at a specific address, you can do so as
- follows (Note: in nearly all cases, this technique is unnecessary;
- very rarely does it matter where exactly you locate a bit of code, and
- allowing RISC OS Forthmacs to allocate the memory for you is
- sufficient and convenient).
-
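- Save the dictionary pointer DP, set it to your address, assemble, and
- then restore it:
-
- dp @ \ save the current dictionary pointer on the stack
- your-adr dp ! \ assemble at your-adr from now on
- code demo
- ...... c;
- dp ! \ restore the saved dictionary pointer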
-
-
- Conditional branches
- ====================
-
- In order to implement conditional operations and loops, most
- assemblers provide branch instructions and labels. RISC OS Forthmacs
- has branches and labels too, but it also has a much better way, which
- eliminates most of the troublesome aspects of coding conditionals and
- loops in assembly language. The RISC OS Forthmacs way is called
- "structured conditionals". For example, suppose we want to test a
- condition and execute some code only if the condition is true.
- Specifically, we want to compare r0 and r1, and execute some code only
- if r0 is less than r1 .
-
- Traditional assembler:
-
- cmp r0, r1
- bge temp
- ..some code we want to conditionally execute
- temp:
-
- Forthmacs assembler with structured conditionals:
-
- r0 r1 cmp
- < if
- ..some code we want to conditionally execute..
- then
-
- As you can see, RISC OS Forthmacs eliminates the need to mentally
- reverse the sense of the comparison, eliminates the need to invent and
- keep track of label names, and uses conventional mathematical
- comparison symbols (e.g. "<"), rather than alphabetic mnemonics. The
- complete set of comparison symbols is given in the "ARM Assembler"
- chapter.
-
- The "if .. then" construct can also include an "else" clause:
-
- r0 r1 s cmp \ the s is optional
- < if
- ..code to execute if r0 < r1..
- else
- ..code to execute if r0 >= r1..
- then
-
- Of course, the assembler actually generates conditional branch
- instructions because that's what the hardware supports directly, but
- RISC OS Forthmacs takes care of the "bookkeeping" for you.
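-
- That bookkeeping can be pictured with a short Python sketch. It is not
- the Forthmacs implementation; it only illustrates the usual technique,
- in which "if" emits a forward branch on the opposite condition and
- "then" fills in the target once it is known. The Assembler class, its
- method names and the L<n> label format are hypothetical.

```python
# Sketch of structured conditionals by backpatching (illustrative only).

INVERSE = {"<": "ge", ">=": "lt", ">": "le", "<=": "gt", "eq": "ne", "ne": "eq"}

class Assembler:
    def __init__(self):
        self.code = []      # emitted instructions, one entry per slot
        self.pending = []   # indices of branches still waiting for a target

    def emit(self, text):
        self.code.append(text)

    def if_(self, cond):
        # Branch around the body when the condition is NOT met.
        self.pending.append(len(self.code))
        self.code.append(("b" + INVERSE[cond], None))

    def else_(self):
        # Unconditional branch over the else part, then resolve the "if".
        patch = self.pending.pop()
        self.pending.append(len(self.code))
        self.code.append(("b", None))
        self._resolve(patch)

    def then(self):
        self._resolve(self.pending.pop())

    def _resolve(self, index):
        # The target is the next slot to be emitted.
        mnemonic, _ = self.code[index]
        self.code[index] = f"{mnemonic} L{len(self.code)}"

asm = Assembler()
asm.emit("cmp r0, r1")
asm.if_("<")
asm.emit("mov r2, #1")
asm.else_()
asm.emit("mov r2, #0")
asm.then()

for slot, text in enumerate(asm.code):
    print(f"L{slot}: {text}")
```

- Because pending branches are kept on a stack, nested conditionals
- resolve in the right order without any label names.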
-
- Another way would be to use the conditional instructions offered by
- the ARM CPU:
-
- r0 r1 cmp
- xx xx lt xxx
- yy yy ge xxx
-
-
-
- Delayed Branches
- ================
-
- ARM doesn't use delayed branches at all, so don't worry.
-
-
- Loops
- =====
-
- RISC OS Forthmacs structured conditionals also have features for
- easily creating loops. Here is a loop which executes forever:
-
- Source              Generates
-
- begin               Label1:
- r0 top ) ldr        ldr r0,[r10,#0]
- again               b Label1
-
- This code assumes that the r10 register (top of stack, remember?)
- contains the address of a memory location, and the contents of that
- memory location is continuously read into the r0 register. This is an
- infinite loop; it won't stop until the system is reset, or power
- cycled, or externally interrupted in some way.
-
- Suppose we want the loop to execute 9 times then quit:
-
- r1 9 # mov
- begin
- r0 top ) ldr
- r1 r1 1 # s sub
- <= until
-
-
- We continue to loop "until" r1 <= 1 before the decrement, that is, until
- the subtraction leaves a result <= 0.
-
- Finally, here's an example where we perform a test at the top of the
- loop rather than at the bottom, illustrating "while":
-
- r1 9 # mov
- begin
- r1 r1 1 # s sub
- > while
- r0 top ) ldr
- repeat
-
-
- This loop continues to execute "while" r1 > 1, and the "repeat" sends
- it back to the "begin".
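-
- To check how often each loop body runs, the two counted loops above
- can be simulated in Python. The sketch assumes that "s sub" sets the
- flags from the result of the subtraction, so that "<=" and ">" compare
- the decremented r1 with zero. The function names are hypothetical.

```python
# Simulate the two counted loops, assuming the condition tests (r1 - 1) vs 0.

def until_loop(start):
    """begin <body>  r1 := r1 - 1  <= until"""
    r1, body_runs = start, 0
    while True:
        body_runs += 1          # r0 top ) ldr
        r1 -= 1                 # r1 r1 1 # s sub
        if r1 <= 0:             # <= until : leave when the result is <= 0
            return body_runs

def while_loop(start):
    """begin  r1 := r1 - 1  > while <body> repeat"""
    r1, body_runs = start, 0
    while True:
        r1 -= 1                 # r1 r1 1 # s sub
        if not r1 > 0:          # > while : stay only while the result is > 0
            return body_runs
        body_runs += 1          # r0 top ) ldr

print(until_loop(9), while_loop(9))
```

- Under that assumption the "until" loop runs its body nine times, while
- the "while" loop runs it eight times, because its test comes before
- the body.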
-
- Structured conditionals and loops nest in the expected manner, to an
- arbitrary depth. For instance, a "begin .. until" can be completely
- contained within an "if .. then", which itself may be contained
- within a "begin .. while .. repeat".
-
-
- Scope Loops - Assembler vs. Forth
- =================================
-
- You can use assembly language for creating scope loops, but it is
- usually preferable to write them in Forth, because the Forth version
- is usually easier to write, easier to read, and easier to debug. The
- one advantage of an assembly language loop is that it is tighter.
- However this rarely matters. For comparison, suppose that you want to
- continually read location 1000 so that you can observe the action on
- an oscilloscope. This is how you would do it in assembly language:
-
- code test
- r0 th 1000 # mov
- begin
- r1 r0 ) ldr
- again c;
-
- Here's how you would do the same thing in Forth:
-
- begin 1000 @ drop again
-
- Additionally, the Forth version may be easily adapted to stop looping
- as soon as a key is typed:
-
- begin 1000 @ drop key? until
-
- More importantly, many of today's complicated chips require fairly
- extensive initialization sequences in order to configure them to the
- correct operating mode. Such code is much easier to write and debug
- in Forth, because you can "try things out" by typing commands at the
- keyboard, then looking at the registers to see what happened.
-
- A set of simple Forth commands sufficient to do most hardware
- debugging jobs can easily be described on a single page, and many
- engineers and technicians have learned enough Forth in 30 minutes to
- be able to write sophisticated diagnostics for complicated hardware.
-
-
- Stack Usage
- ===========
-
- A previous example has shown how to access the top element on the
- stack which is stored in r10. Things get a little more complicated if
- more than 1 stack argument is needed. Remember that the top of the
- stack is stored in r10, and subsequent stack items are stored in a
- memory area whose address is contained in r13. For convenience, the
- assembler provides alternate names for r10 and r13, reflecting the use
- of these registers for the stack. r10 is also known as TOP (Top of
- Stack), and r13 is also known as SP (Stack Pointer).
-
- The basic rules for the Forth stack are:
-
- a) Upon entry to a CODE definition (assembly language), the top of the
- stack is contained in TOP. The next item on the stack is in the memory
- location whose address is contained in SP. The item after that is in
- memory at SP+4 , the next at SP+8 , etc. Note that successive stack
- items are 4 bytes (32-bits) apart.
-
- b) A definition may modify the stack contents, and upon exit from the
- definition the new top of the stack should be in TOP, and the next
- item should be in memory at the address contained in SP.
-
- c) Assembly code should not access memory at negative offsets from SP.
- This restriction safeguards against problems in an interrupt-driven
- environment, in case the same stack happens to be used for interrupt
- handlers.
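-
- Rules a) and b) can be pictured with a small Python model in which TOP
- is a variable and the rest of the stack lives in a memory indexed by
- SP. The Machine class, its method names and the starting address are
- hypothetical, and the two helpers only imitate what the "r0 sp pop"
- and "top sp push" macros appear to do.

```python
# Toy model of the Forth data stack: TOP in a register, the rest in memory.

class Machine:
    def __init__(self, *items):
        self.mem = {}            # address -> 32-bit cell
        self.sp = 0x8000         # arbitrary starting address (assumption)
        self.top = 0             # filler value below the first real item
        for item in items:       # the last argument ends up in TOP
            self.push_top()
            self.top = item

    def push_top(self):
        """Like 'top sp push': move SP down by 4 and store TOP there."""
        self.sp -= 4
        self.mem[self.sp] = self.top

    def pop(self):
        """Like 'rN sp pop': fetch the cell at SP and move SP up by 4."""
        value = self.mem[self.sp]
        self.sp += 4
        return value

def forth_and(m):                # ( n1 n2 -- n3 )
    r0 = m.pop()
    m.top = m.top & r0

def forth_drop(m):               # ( n1 n2 -- n1 )
    m.top = m.pop()

def forth_dup(m):                # ( n1 -- n1 n1 )
    m.push_top()

m = Machine(0b1100, 0b1010)
forth_and(m)
print(bin(m.top))
forth_dup(m)
print(m.top, m.mem[m.sp])
```

- Note that every operation leaves the new top of stack in TOP and the
- next item at the address in SP, and that nothing is read below SP.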
-
- If items are removed from the stack by a code definition, care must be
- taken to make sure the correct top of stack value is left in TOP. Also
- remember that the RISC OS Forthmacs assembler provides macros to
- assist in managing the stack. Here are some examples; study them
- carefully:
-
- code and (s n1 n2 -- n3 )
- r0 sp pop
- top top r0 and c;
-
- code min (s n1 n2 -- n1|n2 )
- r0 sp pop
- top r0 cmp
- top r0 gt mov c;
-
- code drop (s n1 n2 -- n1 )
- top sp pop c;
-
- code dup (s n1 -- n1 n1 )
- top sp push c;
-
- code 1+ (s n -- n+1 ) top 1 incr c;
-
- code @ (s a_adr -- n )
- top top ) ldr c;
-
- \ a somewhat optimized fill
- code fill (s adr cnt char -- )
- r2 top top 8 #lsl orr
- r0 r1 top 3 sp ia! ldm \ r0-cnt r1-adr r2-data
- r0 4 # cmp
- gt if
- begin r3 r1 3 # s and
- r0 1 ne decr
- r2 r1 byte )+ ne str
- eq until
- r0 8 s decr
- r2 r2 r2 10 #lsl orr
- r3 r2 mov
- begin r2 r3 2 r1 ia! ge stm
- r0 8 ge s decr
- lt until
- r0 4 s incr
- r2 r1 )+ ge str
- r0 4 lt decr
- then
- begin r0 1 s decr
- r2 r1 byte )+ ge str
- lt until c;
-
- code >name \ (s cfa -- nfa )
- top 1 decr \ skip flag byte
- begin r0 top byte -( ldr
- r0 0 # cmp
- ne until
- begin r0 top byte -( ldr
- r0 20 # cmp
- lt until c;
-
-